Goto

Collaborating Authors

 resource allocation problem



Censored Semi-Bandits: A Framework for Resource Allocation with Censored Feedback

Arun Verma, Manjesh Hanawal, Arun Rajkumar, Raman Sankaran

Neural Information Processing Systems

The problem is challenging because the loss distribution and threshold value of each arm are unknown. We study this novel setting by establishing its'equivalence' to Multiple-Play Multi-Armed Bandits (MP-MAB) andCombinatorial Semi-Bandits.


LM4Opt-RA: A Multi-Candidate LLM Framework with Structured Ranking for Automating Network Resource Allocation

Ahmed, Tasnim, Rizwan, Siana, Ejaz, Naveed, Choudhury, Salimur

arXiv.org Artificial Intelligence

Building on advancements in Large Language Models (LLMs), we can tackle complex analytical and mathematical reasoning tasks requiring nuanced contextual understanding. A prime example of such complex tasks is modelling resource allocation optimization in networks, which extends beyond translating natural language inputs into mathematical equations or Linear Programming (LP), Integer Linear Programming (ILP), and Mixed-Integer Linear Programming (MILP) models. However, existing benchmarks and datasets cannot address the complexities of such problems with dynamic environments, interdependent variables, and heterogeneous constraints. To address this gap, we introduce NL4RA, a curated dataset comprising 50 resource allocation optimization problems formulated as LP, ILP, and MILP. We then evaluate the performance of well-known open-source LLMs with varying parameter counts. To enhance existing LLM based methods, we introduce LM4Opt RA, a multi candidate framework that applies diverse prompting strategies such as direct, few shot, and chain of thought, combined with a structured ranking mechanism to improve accuracy. We identified discrepancies between human judgments and automated scoring such as ROUGE, BLEU, or BERT scores. However, human evaluation is time-consuming and requires specialized expertise, making it impractical for a fully automated end-to-end framework. To quantify the difference between LLM-generated responses and ground truth, we introduce LLM-Assisted Mathematical Evaluation (LAME), an automated metric designed for mathematical formulations. Using LM4Opt-RA, Llama-3.1-70B achieved a LAME score of 0.8007, outperforming other models by a significant margin, followed closely by Llama-3.1-8B. While baseline LLMs demonstrate considerable promise, they still lag behind human expertise; our proposed method surpasses these baselines regarding LAME and other metrics.



Linear Multi-Resource Allocation with Semi-Bandit Feedback

Tor Lattimore, Koby Crammer, Csaba Szepesvari

Neural Information Processing Systems

We study an idealised sequential resource allocation problem. In each time step the learner chooses an allocation of several resource types between a number of tasks. Assigning more resources to a task increases the probability that it is completed. The problem is challenging because the alignment of the tasks to the resource types is unknown and the feedback is noisy. Our main contribution is the new setting and an algorithm with nearly-optimal regret analysis. Along the way we draw connections to the problem of minimising regret for stochastic linear bandits with heteroscedastic noise. We also present some new results for stochastic linear bandits on the hypercube that significantly improve on existing work, especially in the sparse case.


LLM-OptiRA: LLM-Driven Optimization of Resource Allocation for Non-Convex Problems in Wireless Communications

Peng, Xinyue, Liu, Yanming, Cang, Yihan, Cao, Chaoqun, Chen, Ming

arXiv.org Artificial Intelligence

Solving non-convex resource allocation problems poses significant challenges in wireless communication systems, often beyond the capability of traditional optimization techniques. To address this issue, we propose LLM-OptiRA, the first framework that leverages large language models (LLMs) to automatically detect and transform non-convex components into solvable forms, enabling fully automated resolution of non-convex resource allocation problems in wireless communication systems. LLM-OptiRA not only simplifies problem-solving by reducing reliance on expert knowledge, but also integrates error correction and feasibility validation mechanisms to ensure robustness. Experimental results show that LLM-OptiRA achieves an execution rate of 96% and a success rate of 80% on GPT-4, significantly outperforming baseline approaches in complex optimization tasks across diverse scenarios.


Towards Scalable O-RAN Resource Management: Graph-Augmented Proximal Policy Optimization

Ngo, Duc-Thinh, Piamrat, Kandaraj, Aouedi, Ons, Hassan, Thomas, Raipin-Parvédy, Philippe

arXiv.org Artificial Intelligence

Open Radio Access Network (O-RAN) architectures enable flexible, scalable, and cost-efficient mobile networks by disaggregating and virtualizing baseband functions. However, this flexibility introduces significant challenges for resource management, requiring joint optimization of functional split selection and virtualized unit placement under dynamic demands and complex topologies. Existing solutions often address these aspects separately or lack scalability in large and real-world scenarios. In this work, we propose a novel Graph-Augmented Proximal Policy Optimization (GPPO) framework that leverages Graph Neural Networks (GNNs) for topology-aware feature extraction and integrates action masking to efficiently navigate the combinatorial decision space. Our approach jointly optimizes functional split and placement decisions, capturing the full complexity of O-RAN resource allocation. Extensive experiments on both small-and large-scale O-RAN scenarios demonstrate that GPPO consistently outperforms state-of-the-art baselines, achieving up to 18% lower deployment cost and 25% higher reward in generalization tests, while maintaining perfect reliability. These results highlight the effectiveness and scalability of GPPO for practical O-RAN deployments.



GFlowNets for Active Learning Based Resource Allocation in Next Generation Wireless Networks

Chaaya, Charbel Bou, Bennis, Mehdi

arXiv.org Artificial Intelligence

--In this work, we consider the radio resource allocation problem in a wireless system with various integrated functionalities, such as communication, sensing and computing. We design suitable resource management techniques that can simultaneously cater to those heterogeneous requirements, and scale appropriately with the high-dimensional and discrete nature of the problem. We propose a novel active learning framework where resource allocation patterns are drawn sequentially, evaluated in the environment, and then used to iteratively update a surrogate model of the environment. Our method leverages a generative flow network (GFlowNet) to sample favorable solutions, as such models are trained to generate compositional objects proportionally to their training reward, hence providing an appropriate coverage of its modes. As such, GFlowNet generates diverse and high return resource management designs that update the surrogate model and swiftly discover suitable solutions. We provide simulation results showing that our method can allocate radio resources achieving 20% performance gains against benchmarks, while requiring less than half of the number of acquisition rounds.


Decision-centric fairness: Evaluation and optimization for resource allocation problems

De Vos, Simon, Van Belle, Jente, Algaba, Andres, Verbeke, Wouter, Verboven, Sam

arXiv.org Artificial Intelligence

Data-driven decision support tools play an increasingly central role in decision-making across various domains. In this work, we focus on binary classification models for predicting positive-outcome scores and deciding on resource allocation, e.g., credit scores for granting loans or churn propensity scores for targeting customers with a retention campaign. Such models may exhibit discriminatory behavior toward specific demographic groups through their predicted scores, potentially leading to unfair resource allocation. We focus on demographic parity as a fairness metric to compare the proportions of instances that are selected based on their positive outcome scores across groups. In this work, we propose a decision-centric fairness methodology that induces fairness only within the decision-making region -- the range of relevant decision thresholds on the score that may be used to decide on resource allocation -- as an alternative to a global fairness approach that seeks to enforce parity across the entire score distribution. By restricting the induction of fairness to the decision-making region, the proposed decision-centric approach avoids imposing overly restrictive constraints on the model, which may unnecessarily degrade the quality of the predicted scores. We empirically compare our approach to a global fairness approach on multiple (semi-synthetic) datasets to identify scenarios in which focusing on fairness where it truly matters, i.e., decision-centric fairness, proves beneficial.